{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# nuScenes prediction tutorial\n",
    "<img src=\"https://www.nuscenes.org/public/tutorials/trajectory.gif\" width=\"300\" align=\"left\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook serves as an introduction to the new functionality added to the nuScenes devkit for the prediction challenge.\n",
    "\n",
    "It is organized into the following sections:\n",
    "\n",
    "1. Data splits for the challenge\n",
    "2. Getting past and future data for an agent \n",
    "3. Changes to the Map API\n",
    "4. Overview of input representation\n",
    "5. Model implementations\n",
    "6. Making a submission to the challenge"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes import NuScenes\n",
    "\n",
    "# This is the path where you stored your copy of the nuScenes dataset.\n",
    "DATAROOT = '/data/sets/nuscenes'\n",
    "\n",
    "nuscenes = NuScenes('v1.0-mini', dataroot=DATAROOT)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Data Splits for the Prediction Challenge\n",
    "\n",
    "This section assumes basic familiarity with the nuScenes [schema](https://www.nuscenes.org/nuscenes#data-format).\n",
    "\n",
    "The goal of the nuScenes prediction challenge is to predict the future location of agents in the nuScenes dataset. Agents are indexed by an instance token and a sample token. To get a list of agents in the train and val split of the challenge, we provide a function called `get_prediction_challenge_split`.\n",
    "\n",
    "The get_prediction_challenge_split function returns a list of strings of the form {instance_token}_{sample_token}. In the next section, we show how to use an instance token and sample token to query data for the prediction challenge."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.eval.prediction.splits import get_prediction_challenge_split\n",
    "mini_train = get_prediction_challenge_split(\"mini_train\", dataroot=DATAROOT)\n",
    "mini_train[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Getting past and future data for an agent\n",
    "\n",
    "We provide a class called `PredictHelper` that provides methods for querying past and future data for an agent. This class is instantiated by wrapping an instance of the `NuScenes` class. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.prediction import PredictHelper\n",
    "helper = PredictHelper(nuscenes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the data for an agent at a particular point in time, use the `get_sample_annotation` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "instance_token, sample_token = mini_train[0].split(\"_\")\n",
    "annotation = helper.get_sample_annotation(instance_token, sample_token)\n",
    "annotation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the future/past of an agent, use the `get_past_for_agent`/`get_future_for_agent` methods. If the `in_agent_frame` parameter is set to true, the coordinates will be in the agent's local coordinate frame. Otherwise, they will be in the global frame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "future_xy_local = helper.get_future_for_agent(instance_token, sample_token, seconds=3, in_agent_frame=True)\n",
    "future_xy_local"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The agent's coordinate frame is centered on the agent's current location and the agent's heading is aligned with the positive y axis. For example, the last coordinate in `future_xy_local` corresponds to a location 0.31 meters to the left and 9.67 meters in front of the agents starting location."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "future_xy_global = helper.get_future_for_agent(instance_token, sample_token, seconds=3, in_agent_frame=False)\n",
    "future_xy_global"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that you can also return the entire annotation record by passing `just_xy=False`. However in this case, `in_agent_frame` is not taken into account."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "helper.get_future_for_agent(instance_token, sample_token, seconds=3, in_agent_frame=True, just_xy=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you would like to return the data for the entire sample, as opposed to one agent in the sample, you can use the `get_annotations_for_sample` method. This will return a list of records for each annotated agent in the sample."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sample = helper.get_annotations_for_sample(sample_token)\n",
    "len(sample)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that there are `get_future_for_sample` and `get_past_for_sample` methods that are analogous to the `get_future_for_agent` and `get_past_for_agent` methods.\n",
    "\n",
    "We also provide methods to compute the velocity, acceleration, and heading change rate of an agent at a given point in time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# We get new instance and sample tokens because these methods require computing the difference between records.\n",
    "instance_token_2, sample_token_2 = mini_train[5].split(\"_\")\n",
    "\n",
    "# Meters / second.\n",
    "print(f\"Velocity: {helper.get_velocity_for_agent(instance_token_2, sample_token_2)}\\n\")\n",
    "\n",
    "# Meters / second^2.\n",
    "print(f\"Acceleration: {helper.get_acceleration_for_agent(instance_token_2, sample_token_2)}\\n\")\n",
    "\n",
    "# Radians / second.\n",
    "print(f\"Heading Change Rate: {helper.get_heading_change_rate_for_agent(instance_token_2, sample_token_2)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Changes to the Map API\n",
    "\n",
    "We've added a couple of methods to the Map API to help query lane center line information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.map_expansion.map_api import NuScenesMap\n",
    "nusc_map = NuScenesMap(map_name='singapore-onenorth', dataroot=DATAROOT)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the closest lane to a location, use the `get_closest_lane` method. To see the internal data representation of the lane, use the `get_lane_record` method. \n",
    "You can also explore the connectivity of the lanes, with the `get_outgoing_lanes` and `get_incoming_lane` methods."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x, y, yaw = 395, 1095, 0\n",
    "closest_lane = nusc_map.get_closest_lane(x, y, radius=2)\n",
    "closest_lane"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lane_record = nusc_map.get_arcline_path(closest_lane)\n",
    "lane_record"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nusc_map.get_incoming_lane_ids(closest_lane)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nusc_map.get_outgoing_lane_ids(closest_lane)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To help manipulate the lanes, we've added an `arcline_path_utils` module. For example, something you might want to do is discretize a lane into a sequence of poses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.map_expansion import arcline_path_utils\n",
    "poses = arcline_path_utils.discretize_lane(lane_record, resolution_meters=1)\n",
    "poses"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given a query pose, you can also find the closest pose on a lane."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "closest_pose_on_lane, distance_along_lane = arcline_path_utils.project_pose_to_lane((x, y, yaw), lane_record)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(x, y, yaw)\n",
    "closest_pose_on_lane"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Meters.\n",
    "distance_along_lane"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To find the entire length of the lane, you can use the `length_of_lane` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "arcline_path_utils.length_of_lane(lane_record)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also compute the curvature of a lane at a given distance along the lane."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 0 means it is a straight lane.\n",
    "arcline_path_utils.get_curvature_at_distance_along_lane(distance_along_lane, lane_record)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Input Representation\n",
    "\n",
    "It is common in the prediction literature to represent the state of an agent as a tensor containing information about the semantic map (such as the drivable area and walkways), as well the past locations of surrounding agents.\n",
    "\n",
    "Each paper in the field chooses to represent the input in a slightly different way. For example, [CoverNet](https://arxiv.org/pdf/1911.10298.pdf) and [MTP](https://arxiv.org/pdf/1808.05819.pdf) choose to rasterize the map information and agent locations into a three channel RGB image. But [Rules of the Road](http://openaccess.thecvf.com/content_CVPR_2019/papers/Hong_Rules_of_the_Road_Predicting_Driving_Behavior_With_a_Convolutional_CVPR_2019_paper.pdf) decides to use a \"taller\" tensor with information represented in different channels.\n",
    "\n",
    "We provide a module called `input_representation` that is meant to make it easy for you to define your own input representation. In short, you need to define your own `StaticLayerRepresentation`, `AgentRepresentation`, and `Combinator`.\n",
    "\n",
    "The `StaticLayerRepresentation` controls how the static map information is represented. The `AgentRepresentation` controls how the locations of the agents in the scene are represented. The `Combinator` controls how these two sources of information are combined into a single tensor.\n",
    "\n",
    "For more information, consult `input_representation/interface.py`.\n",
    "\n",
    "To help get you started, we've provided implementations of input representation used in CoverNet and MTP."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "from nuscenes.prediction.input_representation.static_layers import StaticLayerRasterizer\n",
    "from nuscenes.prediction.input_representation.agents import AgentBoxesWithFadedHistory\n",
    "from nuscenes.prediction.input_representation.interface import InputRepresentation\n",
    "from nuscenes.prediction.input_representation.combinators import Rasterizer\n",
    "\n",
    "static_layer_rasterizer = StaticLayerRasterizer(helper)\n",
    "agent_rasterizer = AgentBoxesWithFadedHistory(helper, seconds_of_history=1)\n",
    "mtp_input_representation = InputRepresentation(static_layer_rasterizer, agent_rasterizer, Rasterizer())\n",
    "\n",
    "instance_token_img, sample_token_img = 'bc38961ca0ac4b14ab90e547ba79fbb6', '7626dde27d604ac28a0240bdd54eba7a'\n",
    "anns = [ann for ann in nuscenes.sample_annotation if ann['instance_token'] == instance_token_img]\n",
    "img = mtp_input_representation.make_input_representation(instance_token_img, sample_token_img)\n",
    "\n",
    "plt.imshow(img)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model Implementations\n",
    "\n",
    "We've provided PyTorch implementations for CoverNet and MTP. Below we show, how to make predictions on the previously created input representation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.prediction.models.backbone import ResNetBackbone\n",
    "from nuscenes.prediction.models.mtp import MTP\n",
    "from nuscenes.prediction.models.covernet import CoverNet\n",
    "import torch"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both models take a CNN backbone as a parameter. We've provided wrappers for ResNet and MobileNet v2. In this example, we'll use ResNet50."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "backbone = ResNetBackbone('resnet50')\n",
    "mtp = MTP(backbone, num_modes=2)\n",
    "\n",
    "# Note that the value of num_modes depends on the size of the lattice used for CoverNet.\n",
    "covernet = CoverNet(backbone, num_modes=64)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The second input is a tensor containing the velocity, acceleration, and heading change rate for the agent."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent_state_vector = torch.Tensor([[helper.get_velocity_for_agent(instance_token_img, sample_token_img),\n",
    "                                    helper.get_acceleration_for_agent(instance_token_img, sample_token_img),\n",
    "                                    helper.get_heading_change_rate_for_agent(instance_token_img, sample_token_img)]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "image_tensor = torch.Tensor(img).permute(2, 0, 1).unsqueeze(0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Output has 50 entries.\n",
    "# The first 24 are x,y coordinates (in the agent frame) over the next 6 seconds at 2 Hz for the first mode.\n",
    "# The second 24 are the x,y coordinates for the second mode.\n",
    "# The last 2 are the logits of the mode probabilities\n",
    "mtp(image_tensor, agent_state_vector)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# CoverNet outputs a probability distribution over the trajectory set.\n",
    "# These are the logits of the probabilities\n",
    "logits = covernet(image_tensor, agent_state_vector)\n",
    "print(logits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CoverNet model outputs a probability distribution over a set of trajectories. To be able to interpret the predictions, and perform inference with CoverNet, you need to download the trajectory sets from the nuscenes website. Download them from this [link](https://www.nuscenes.org/public/nuscenes-prediction-challenge-trajectory-sets.zip) and unzip them in a directory of your choice.\n",
    "\n",
    "Uncomment the following code when you do so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#import pickle\n",
    "\n",
    "# Epsilon is the amount of coverage in the set, \n",
    "# i.e. a real world trajectory is at most 8 meters from a trajectory in this set\n",
    "# We released the set for epsilon = 2, 4, 8. Consult the paper for more information\n",
    "# on how this set was created\n",
    "\n",
    "#PATH_TO_EPSILON_8_SET = \"\"\n",
    "#trajectories = pickle.load(open(PATH_TO_EPSILON_8_SET, 'rb'))\n",
    "\n",
    "# Saved them as a list of lists\n",
    "#trajectories = torch.Tensor(trajectories)\n",
    "\n",
    "# Print 5 most likely predictions\n",
    "#trajectories[logits.argsort(descending=True)[:5]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also provide two physics-based models - A constant velocity and heading model and a physics oracle. The physics oracle estimates the future trajectory of the agent with several physics based models and chooses the one that is closest to the ground truth. It represents the best performance a purely physics based model could achieve on the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.prediction.models.physics import ConstantVelocityHeading, PhysicsOracle\n",
    "\n",
    "cv_model = ConstantVelocityHeading(sec_from_now=6, helper=helper)\n",
    "physics_oracle = PhysicsOracle(sec_from_now=6, helper=helper)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The physics models can be called as functions. They take as input a string of the instance and sample token of the agent concatenated with an underscore (\"_\").\n",
    "\n",
    "The output is a `Prediction` data type. The `Prediction` data type stores the predicted trajectories and their associated probabilities for the agent. We'll go over the `Prediction` type in greater detail in the next section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cv_model(f\"{instance_token_img}_{sample_token_img}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "physics_oracle(f\"{instance_token_img}_{sample_token_img}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Making a submission to the challenge\n",
    "\n",
    "Participants must submit a zipped json file containing serialized `Predictions` for each agent in the validation set.\n",
    "\n",
    "The previous section introduced the `Prediction` data type. In this section, we explain the format in greater detail. \n",
    "\n",
    "A `Prediction` consists of four fields:\n",
    "\n",
    "1. instance: The instance token for the agent.\n",
    "2. sample: The sample token for the agent.\n",
    "3. prediction: Prediction from model. A prediction can consist of up to 25 proposed trajectories. This field must be a numpy array with three dimensions (number of trajectories (also called modes), number of timesteps, 2).\n",
    "4. probabilities: The probability corresponding to each predicted mode. This is a numpy array with shape `(number_of_modes,)`.\n",
    "\n",
    "You will get an error if any of these conditions are violated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from nuscenes.eval.prediction.data_classes import Prediction\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would raise an error because instance is not a string.\n",
    "\n",
    "#Prediction(instance=1, sample=sample_token_img,\n",
    "#           prediction=np.ones((1, 12, 2)), probabilities=np.array([1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would raise an error because sample is not a string.\n",
    "\n",
    "#Prediction(instance=instance_token_img, sample=2,\n",
    "#           prediction=np.ones((1, 12, 2)), probabilities=np.array([1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would raise an error because prediction is not a numpy array.\n",
    "\n",
    "#Prediction(instance=instance_token_img, sample=sample_token_img,\n",
    "#           prediction=np.ones((1, 12, 2)).tolist(), probabilities=np.array([1]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would throw an error because probabilities is not a numpy array. Uncomment to see.\n",
    "\n",
    "#Prediction(instance=instance_token_img, sample=sample_token_img,\n",
    "#           prediction=np.ones((1, 12, 2)), probabilities=[0.3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would throw an error because there are more than 25 predicted modes. Uncomment to see.\n",
    "\n",
    "#Prediction(instance=instance_token_img, sample=sample_token_img,\n",
    "#           prediction=np.ones((30, 12, 2)), probabilities=np.array([1/30]*30))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This would throw an error because the number of predictions and probabilities don't match. Uncomment to see.\n",
    "\n",
    "#Prediction(instance=instance_token_img, sample=sample_token_img,\n",
    "           #prediction=np.ones((13, 12, 2)), probabilities=np.array([1/12]*12))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make a submission to the challenge, store your model predictions in a python list and save it to json. Then, upload a zipped version of your file to the eval server. \n",
    "\n",
    "For an example, see `eval/prediction/baseline_model_inference.py`"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
